ASYMPTOTIC THEORY FOR NONPARAMETRIC CONFIDENCE INTERVALS

Peter  W.  Glynn 

1.  Introduction 

The problem of assigning nonparametric confidence intervals has
recently been the focus of renewed attention. One impetus has been
the development of the "bootstrap" method by EFRON (1979) as a general
nonparametric statistical tool. BICKEL and FREEDMAN (1981), as well
as SINGH (1981), have shown that the bootstrap's distributional
approximation is asymptotically valid in a wide variety of
circumstances, while EFRON (1981) has studied, in particular, the
bootstrap's viability for setting confidence intervals. The recognition
that computing power is increasingly available has allowed
statisticians to consider confidence interval methods, such as the
bootstrap, that are computationally more complex but statistically
better behaved than previous techniques. The pivotal transformation of
JOHNSON (1978) is another such procedure.

Nonparametric confidence interval methodology has also attracted
considerable study in the Monte Carlo simulation literature; see CRANE
and LEMOINE (1977), FISHMAN (1978), and LAW and KELTON (1982), for
example. The idea is to assign confidence intervals to point
estimators obtained from a simulation output sequence, in order to
give the simulator an assessment of the estimates' variability.

The simulation applications mentioned above dictate that we
analyze the confidence interval problem for ratio estimators. To be
precise, we shall consider the problem of estimating $r = EY_n/E\tau_n$
from a sequence of independent and identically distributed (i.i.d.)
random vectors (r.v.'s) $\{(Y_n, \tau_n);\ n \ge 1\}$, where
$E(|Y_n| + |\tau_n|) < \infty$ and $E\tau_n \ne 0$. Of course, the
classical nonparametric situation is captured as a special case, by
setting $\tau_n \equiv 1$.

The organization of this chapter is as follows. In Section 2, we
show that ratio estimators arise naturally in the context of the
simulation and/or statistical analysis of ergodic quantities associated
with regenerative stochastic processes. Section 3 discusses the basic
central limit theorem (CLT) on which all the confidence interval
methods to be considered in this chapter will be based. Asymptotic
error analysis of these techniques requires certain tools from the
theory of Edgeworth expansions. In Section 4, results of
BHATTACHARYA and GHOSH (1978) and GÖTZE and HIPP (1978) are extended
to accommodate the generalizations required by the ratio estimator
problem.

In Section 5, we obtain a rigorous Edgeworth expansion for the
ratio estimator pivot statistic. This extends the work of CHUNG
(1946) from the classical case to the ratio problem (the formulas
there contain some errors, however; see WALLACE (1958), p. 642). This
enables us, in Section 6, to analyze the error asymptotics of the
ratio pivot confidence interval, as well as two related intervals. In
particular, we are able to identify precisely the effect of the
Student t-correction (i.e., using Student t-quantiles rather than
normal quantiles in the limit approximation) on coverage.

In Section 7, we extend Johnson's pivotal transformation to ratio
pivots, and show that it corrects for the asymmetry effects of order
$n^{-1/2}$ ($n$ is the sample size) that occur in the standard pivot.
Section 8 presents a second-order pivotal transformation which
corrects coverage error in the standard pivot to order $n^{-1}$. It
turns out that this second-order pivot is the nonparametric analogue
of a transformation suggested by HOTELLING and FRANKEL (1938) to
"normalize" Student t-variates. Section 9 discusses computational
issues and displays results of Monte Carlo sampling experiments in
which the coverage characteristics of the pivotal transformations were
compared with those of the untransformed pivot.


2.  Some  Applications  of  Ratio  Estimator  Confidence  Intervals 

The possibility of extending confidence interval methodology from
the classical framework to the ratio estimator context has been
previously studied in the statistical literature. For example, ROY
and POTTHOFF (1958) discuss this problem in the case where
$(Y_n, \tau_n)$ has a bivariate normal distribution. Their motivation
stemmed from applications in which a comparison of $EY_n$ and
$E\tau_n$, in terms of their ratio, is desired. For instance, in
evaluating the effect of a treatment, the ratio of the mean of the
treated population to the mean of the untreated population is of
interest.

More recently, this problem has attracted considerable attention
in the simulation community. Consider a measurable regenerative
stochastic process $\{X_t;\ t \ge 0\}$ (see SMITH (1955) for a complete
discussion). Then, there exist random times $T_1 < T_2 < \cdots$ with
$T_n \to \infty$ such that the vectors $\{(Y_k(f), \tau_k);\ k \ge 1\}$
are i.i.d., where

$$Y_k(f) = \int_{T_k}^{T_{k+1}} f(X_s)\,ds$$

$$\tau_k = T_{k+1} - T_k$$

for any suitably measurable real-valued function $f$. It can be shown
(see [13]) that if $E(Y_n(|f|) + \tau_n) < \infty$, then

$$\frac{1}{t}\int_0^t f(X_s)\,ds \to r(f) \equiv EY_n(f)/E\tau_n \quad \text{a.s.}$$

as $t \to \infty$.

Hence, development of confidence intervals for ergodic quantities such
as $r(f)$, in the context of regenerative processes, leads naturally to
the study of ratio estimators. For a complete discussion of the
simulation issues related to regenerative ratio estimates, we refer
the reader to IGLEHART (1978), and Chapter 6 of RUBINSTEIN (1981).
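
As a rough illustration of how cycle quantities feed a ratio estimator, the sketch below works with a discrete-time path rather than the continuous-time integral above, so that $Y_k(f)$ becomes a sum over the cycle and $\tau_k$ a step count. The function names `regenerative_cycles` and `ratio_estimate`, and the use of returns to a single state as regeneration times, are assumptions made for the example and do not come from the text.

```python
from typing import Callable, List, Sequence, Tuple


def regenerative_cycles(
    path: Sequence[int], regen_state: int, f: Callable[[int], float]
) -> List[Tuple[float, int]]:
    """Split a discrete-time path into cycles that begin at visits to regen_state.

    Returns one (Y_k, tau_k) pair per completed cycle, where Y_k is the sum of
    f over the cycle and tau_k is the cycle length (hypothetical discrete
    analogue of the integral definition).
    """
    starts = [i for i, x in enumerate(path) if x == regen_state]
    cycles = []
    for a, b in zip(starts, starts[1:]):
        cycles.append((sum(f(x) for x in path[a:b]), b - a))
    return cycles


def ratio_estimate(cycles: Sequence[Tuple[float, int]]) -> float:
    """Ratio estimator: (sum of Y_k) / (sum of tau_k)."""
    total_y = sum(y for y, _ in cycles)
    total_tau = sum(t for _, t in cycles)
    return total_y / total_tau
```

Data before the first regeneration and after the last one are discarded here, which is one common convention; other treatments of the incomplete cycles are possible.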

Of course, it is clear that the regenerative approach is equally
applicable to the nonparametric analysis of statistical data modelled


3. Confidence Intervals for Ratio Estimators


For the remainder of this chapter, we assume that
$E(|Y_n| + |\tau_n|) < \infty$, $E\tau_n \ne 0$, and
$0 < \sigma^2(Z_n) < \infty$, where $Z_n = Y_n - r\tau_n$. Also, without
loss of generality, we assume that $E\tau_n > 0$ (otherwise, we pass to
$(-Y_n, -\tau_n)$). For a generic sequence $\{\xi_i;\ i \ge 1\}$ of
i.i.d. r.v.'s, we shall use the notation
$\bar\xi_n = \sum_{k=1}^n \xi_k/n$ and
$\tilde\xi_n = n^{1/2}(\bar\xi_n - E\xi_n)$. The r.v.'s

$$r_n = (\bar Y_n/\bar\tau_n)\, I\{\bar\tau_n \ne 0\}$$

$$v_n = \frac{1}{n-1}\sum_{k=1}^n (Y_k - r_n\tau_k)^2$$

$$t_n = \frac{n^{1/2}(r_n - r)\,\bar\tau_n}{v_n^{1/2}}\, I\{v_n > 0,\ \bar\tau_n \ne 0\}$$

(we interpret a product involving an indicator to be zero if the
indicator is zero) play an important role in ratio estimator
confidence intervals. To be precise, it is not difficult to show that
$r_n \to r$ a.s. and that

(3.1)  $$F_n(x) \equiv P\{t_n \le x\} \to \int_{-\infty}^x \phi(u)\,du \equiv \Phi(x)$$
where $\phi(u) = (2\pi)^{-1/2}\exp(-u^2/2)$. The CLT (3.1) proves that
$[L_n(p), R_n(p)]$ ($0 \le p \le \alpha$) is an approximate
$100(1-\alpha)\%$ confidence interval for $r$, where

(3.2)  $$\begin{aligned}
L_n(p) &= r_n - v_n(p+1-\alpha)\, I\{\bar\tau_n > 0\} - v_n(p)\, I\{\bar\tau_n < 0\} \\
R_n(p) &= r_n - v_n(p)\, I\{\bar\tau_n > 0\} - v_n(p+1-\alpha)\, I\{\bar\tau_n < 0\} \\
v_n(p) &= z(p)\, v_n^{1/2}/(n^{1/2}\,\bar\tau_n)
\end{aligned}$$

and $z(p) = \Phi^{-1}(p)$. In order to study the error asymptotics of
the above intervals, we introduce the error descriptors

(3.3)  $$\begin{aligned}
e_n^{\ell}(p) &= P\{r < L_n(p)\} - (\alpha - p) \\
e_n^{r}(p) &= P\{r > R_n(p)\} - p \\
e_n(p) &= P\{L_n(p) \le r \le R_n(p)\} - (1-\alpha).
\end{aligned}$$

The term $e_n(p)$ measures the coverage probability error in the
interval $[L_n(p), R_n(p)]$, whereas the terms $e_n^{\ell}(p)$,
$e_n^{r}(p)$ provide the one-sided coverage errors. The one-sided
errors will assist us in evaluating the degree to which the above
nonparametric confidence interval captures the asymmetry which is
present in parametric confidence intervals (see [9] for a discussion
of this point).
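
The interval (3.2) and the error descriptors (3.3) can be sketched numerically as follows. This is a minimal illustration rather than the author's procedure: the function names are hypothetical, $v_n$ is computed with divisor $n-1$ (an assumption, since the definition is not fully legible in the source), $0 < p < \alpha$ and $n \ge 2$ are required, and the error descriptors are estimated by plain Monte Carlo for a user-supplied sampler.

```python
import math
import random
from statistics import NormalDist


def ratio_pivot_interval(ys, taus, alpha=0.10, p=None):
    """Return (L_n(p), R_n(p)) for r = EY / E(tau), following the form of (3.2)."""
    n = len(ys)
    p = alpha / 2 if p is None else p
    y_bar = sum(ys) / n
    tau_bar = sum(taus) / n
    if tau_bar == 0:
        raise ValueError("tau_bar is zero; the ratio estimator is undefined")
    r_n = y_bar / tau_bar
    # Assumption: divisor n - 1 in the variance estimate v_n.
    v_n = sum((y - r_n * t) ** 2 for y, t in zip(ys, taus)) / (n - 1)

    def v(q):
        # v_n(q) = z(q) * v_n^{1/2} / (n^{1/2} * tau_bar)
        return NormalDist().inv_cdf(q) * math.sqrt(v_n) / (math.sqrt(n) * tau_bar)

    if tau_bar > 0:
        return r_n - v(p + 1 - alpha), r_n - v(p)
    return r_n - v(p), r_n - v(p + 1 - alpha)


def estimate_coverage_errors(sample_pair, r_true, n, alpha=0.10, p=None,
                             reps=2000, seed=0):
    """Monte Carlo estimates of the left, right and two-sided errors in (3.3).

    sample_pair(rng) must return one (Y, tau) draw using the supplied
    random.Random instance.
    """
    rng = random.Random(seed)
    p = alpha / 2 if p is None else p
    below = above = 0
    for _ in range(reps):
        pairs = [sample_pair(rng) for _ in range(n)]
        left, right = ratio_pivot_interval(
            [y for y, _ in pairs], [t for _, t in pairs], alpha, p)
        below += r_true < left    # event {r < L_n(p)}
        above += r_true > right   # event {r > R_n(p)}
    e_left = below / reps - (alpha - p)
    e_right = above / reps - p
    e_cover = (reps - below - above) / reps - (1 - alpha)
    return e_left, e_right, e_cover
```

For example, `estimate_coverage_errors(lambda rng: (rng.expovariate(1.0), 1.0), 1.0, n=20)` examines the classical case $\tau_n \equiv 1$ with exponential data. The three estimated errors always sum to zero, since the three events partition the sample space.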

In analogy with the classical nonparametric case, two other
intervals are natural to consider. Let $t_n^*$ be the pivot obtained
from $t_n$ by replacing $v_n$ with $v_n^*$, where
$v_n^* = ((n-1)/n)\,v_n$. Intervals $[L_n^*(p), R_n^*(p)]$, with errors
$e_n^{\ell}(p)^*$, $e_n^{r}(p)^*$, $e_n(p)^*$, are defined analogously
to (3.2) and (3.3).

A second alternative is to use the Student t-distribution with $k$
degrees of freedom. Let $z_k(p)$ be the $p$'th quantile of such a
distribution. Then, $z_k(p) \to z(p)$ as $k \to \infty$ (PEISER
(1943)), and thus, in light of the parametric theory for the bivariate
normal case, it is of interest to consider the intervals
$[L_n'(p), R_n'(p)]$, with errors $e_n^{\ell}(p)'$, $e_n^{r}(p)'$, and
$e_n(p)'$, constructed by substituting $z_{n-1}(p)$ for $z(p)$ in
$v_n(p)$.

Before concluding this section, we observe that if
$E|\tau_n|^{s+2} < \infty$ for $s > 0$, Chebyshev's inequality implies
that

(3.4)  $$P\{\bar\tau_n \le 0\} \le E(\tilde\tau_n)^k/(n^{1/2}E\tau_1)^k$$

where $k$ is an even integer lying in the interval $(s, s+2]$. It is
easily verified algebraically that $E(\tilde\tau_n)^k$ remains bounded,
so $P\{\bar\tau_n \le 0\} = o(n^{-s/2})$. Then,

(3.5)  $$\begin{aligned}
e_n^{\ell}(p) &= -G_n(z(p+1-\alpha)) + o(n^{-s/2}) \\
e_n^{r}(p) &= G_n(z(p)) + o(n^{-s/2}) \\
e_n(p) &= G_n(z(p+1-\alpha)) - G_n(z(p)) + o(n^{-s/2}),
\end{aligned}$$

where $G_n(x) = F_n(x) - \Phi(x)$, provided $E|\tau_n|^{s+2} < \infty$.
Analogous expressions hold for the errors for the other confidence
intervals. Thus, the discussion of confidence interval error leads to
study of asymptotic expansions for $G_n(x)$.


4.  Edgeworth  Expansions  for  Smooth  Statistics 

BHATTACHARYA and GHOSH (1978) have recently shown that the
"delta method" for deriving Edgeworth expansions is rigorously correct
for a wide class of statistics. To be precise, suppose that
$\{V_n:\ n \ge 0\}$ is a sequence of i.i.d. $m$-dimensional r.v.'s, and
let $f_1, \ldots, f_k$ be real-valued Borel measurable functions on
$R^m$. Put

$$U_i = (f_1(V_i), \ldots, f_k(V_i))$$

$$\mu = EU_i$$

and let $H_0, H_1, \ldots, H_s$ be real-valued functions on $R^k$ such
that $H_i$ is continuously differentiable of order $s+2-i$ on a
neighborhood of $\mu$. The objective is to establish an asymptotic
expansion for the distribution of

(4.1)  $$A_n = n^{1/2}\Big(\sum_{k=0}^{s} n^{-k/2}\,[H_k(\bar U_n) - H_k(\mu)]\,c_n\Big) + a_n$$

where

$$a_n = \sum_{k=1}^{s} n^{-k/2}\,a_{k,s} + o(n^{-s/2})$$

and

$$c_n = 1 + \sum_{k=2}^{s+1} n^{-k/2}\,c_{k,s} + O(n^{-(s+1)/2})$$

are sequences of deterministic constants. The form of $A_n$ outside a
neighborhood of $\mu$ can be taken as an arbitrary real-valued
measurable function of $\bar U_n$ (the constants $a_n$, $c_n$ add a
flexibility which will be necessary later; see (7.2) and (8.3), for
example).

We shall henceforth assume, in our study of $A_n$, that the
covariance matrix $\Sigma$ associated with $U_1$ is non-singular. This
can be done, without loss of generality, by replacing
$1, f_1, \ldots, f_k$ by a maximal (in terms of number of elements)
sub-collection of these functions that is linearly independent as
elements of the $L^2$ space of r.v.'s (see [1], p. 442, for details).

The "delta method" begins by expanding $H_i(u)$ to $(s+1-i)$
terms, as a Taylor series about $u = \mu$. This yields a polynomial
$H_{s,i}(u)$ of degree $(s+1-i)$ and gives rise to a differential
approximation $A_{s,n}$ of $A_n$ as follows:

(4.2)  $$A_{s,n} = n^{1/2}\Big(\sum_{k=0}^{s} n^{-k/2}\,[H_{s,k}(\bar U_n) - H_{s,k}(\mu)]\,c_n\Big) + a_n$$

This can be re-written in the form

(4.3)  $$A_{s,n} = \sum_{k=0}^{s} n^{-k/2}\,P_{s,k}(\tilde U_n) + o(n^{-s/2})$$

(4.7)  $$\psi_{s,n}(v)\,dv = \Big[1 + \sum_{r=1}^{s} n^{-r/2}\,\pi_r(-d/dv)\Big]\phi(v/\sigma)\,dv/\sigma$$

which is the formal Edgeworth expansion of the distribution of $A_n$.

(4.8) THEOREM. (i) Suppose that $E|U_n|^3 < \infty$. Then,

(4.9)  $$P\{A_n \le x\} = \int_{-\infty}^{x} \phi(v/\sigma)\,dv/\sigma + O(n^{-1/2})$$

where $O(n^{-1/2})$ is uniform in $x$.

(ii) Suppose that $E|U_n|^{s+2} < \infty$, and that
$U_1 + \cdots + U_\ell$ has a non-zero Lebesgue density component (in
$R^k$) for some $\ell$. Then,

(4.10)  $$P\{A_n \in B\} = \int_B \psi_{s,n}(v)\,dv + o(n^{-s/2})$$

where $o(n^{-s/2})$ is uniform over all Borel sets $B$. The function
$\psi_{s,n}$ can be calculated via the "delta method" (4.2) through
(4.7).

Although the proof given in [1] restricts its attention to the
case where $a_n = 0$, $c_n = 1$, and $H_k = 0$ for $k \ge 1$, the
argument readily extends to the more general situation considered
here, the only complication being additional notational complexity.

In some related work, CHIBISOV (1972) proved Theorem 4.8(ii) in
the case where $A_n$ was of the polynomial form (4.3) (no
identification of $\psi_{s,n}$ as the expansion obtained via the
"delta method" was made, however). An extension to the general
non-polynomial case was effected via the following "perturbation"
theorem (see [4], p. 629).

(4.11) THEOREM. Suppose that $A_n' = A_n + n^{-(s+1)/2}X_n$, where
$A_n$ satisfies the assumptions of Theorem 4.8(ii) and
$P\{|X_n| > n^{1/2}\rho_n\} = o(n^{-s/2})$ for some sequence
$\rho_n \to 0$. Then,

$$P\{A_n' \le x\} = \int_{-\infty}^{x} \psi_{s,n}(v)\,dv + o(n^{-s/2})$$

where $o(n^{-s/2})$ is uniform in $x$.

It is clear that the $\psi_{s,n}$ of Theorem 4.11 must be that
obtained via the "delta method".

We remark that the density assumption on the $U_i$'s in Theorem
4.8(ii) follows if $V_1$ has a Lebesgue density component which is
positive on an open set where $1, f_1, \ldots, f_k$ are linearly
independent as continuous functions (see Lemma 2.2, [1]). Also, we note
that the moment conditions in Theorem 4.8 are norm independent, due to
the fact that all norms on finite-dimensional spaces are equivalent.

Because of the potentially large number of derivatives required
in calculating the expansions $P_{s,k}(u)$, it is convenient to
consider a modification of the "delta method". Towards this end, let
$Q(j;u)$ be polynomials of degree $p(s+1)$, and set

$$Q_n = \sum_{j=0}^{s} n^{-j/2}\,Q(j;\tilde U_n).$$

(4.12) PROPOSITION. Suppose that $R_n = A_{s,n}^p - Q_n$ satisfies
$n^{s/2}R_n \to 0$ in probability (i.e., $R_n = o_p(n^{-s/2})$). Then,
if $E|U_n|^{(s+1)p} < \infty$, we have
$EA_{s,n}^p = EQ_n + o(n^{-s/2})$.

Proof. The remainder $R_n$ can be written in the form

$$R_n = \sum_{j \ge 0} n^{-j/2}\,R(j;\tilde U_n)$$

where $R(j;u)$ are polynomials in $u$ of degree $p(s+1)$. We now show
that $R(j;u)$ vanishes for $j \le s$. Starting with $j = 0$, observe
that

$$R(0;\tilde U_n) \Rightarrow R(0;N)$$

($\Rightarrow$ denotes weak convergence), where $N$ is a multivariate
normal r.v. with non-singular covariance matrix $\Sigma$. Evidently,
since $n^{s/2}R_n = o_p(1)$ and $R_n - R(0;\tilde U_n) = o_p(1)$, it
must be that $R(0;N)$ is degenerate at 0. On the other hand, if
$R(0;u)$ depends non-trivially on $u_i$ (say), then the Jacobian of the
transformation
$u \to (u_1, \ldots, u_{i-1}, R(0;u), u_{i+1}, \ldots, u_k)$ is
non-singular, and thus it follows that $R(0;N)$ has a Lebesgue density.
This contradiction forces $R(0;u)$ to vanish identically. Repeating the
argument $s$ more times proves that

$$R_n = \sum_{j \ge s+1} n^{-j/2}\,R(j;\tilde U_n).$$

Under the moment conditions given here,
$ER(j;\tilde U_n) \to ER(j;N)$ and consequently $ER_n = o(n^{-s/2})$,
proving our result. ∎

Our final goal in this section is to show that the Edgeworth
expansion (4.8) remains valid, in a certain sense, when the density
assumption on the distribution of $U$ is dropped. Let $C_b^\infty(R)$
be the class of all bounded infinitely differentiable functions and
take $C^*(R) = \{f : D^n f \in C_b^\infty(R), \text{ for all } n\}$
($D = d/dx$). The class $C^*(R)$ includes the trigonometric functions
$\sin(tx)$, $\cos(tx)$, as well as the Schwartz class $S$ (see
BHATTACHARYA and RAO (1976), p. 257). We first need the following
proposition.

(4.13) PROPOSITION. (i) Let $f \in C^*(R)$, and suppose $\alpha$ is a
multiindex (i.e., a non-negative integral vector) with
$|\alpha| \equiv \sum_i \alpha_i \le s+2$. Then, if
$E|U_n|^{s+2} < \infty$, there exists a multivariate Edgeworth
expansion $\xi_{s,n}$ of the distribution of $\tilde U_n$ such that

(4.14)  $$E(\tilde U_n^\alpha f(\rho\cdot\tilde U_n)) = \int u^\alpha f(\rho\cdot u)\,\xi_{s,n}(u)\,du + o(n^{-s/2})$$

holds, for any vector $\rho$.

(ii) If $E|U_n|^{s+2} < \infty$, then the change-of-variables formula

(4.15)  $$\int f(A_{s,n}(u))\,\xi_{s,n}(u)\,du = \int f(y)\,\psi_{s,n}(y)\,dy + o(n^{-s/2})$$

holds for all bounded measurable $f$.

Proof. (i) We use Theorem 3.6 of [14], and observe that boundedness
of the derivatives of $f$ implies that

$$D^\beta(u^\alpha f) = O(|u|^{s+2})$$

($D^\beta \equiv D_1^{\beta_1}\cdots D_k^{\beta_k}$, where
$D_j \equiv \partial/\partial x_j$) for all multiindices $\beta$ with
$|\beta| \le s$. This is sufficient for (i), in the presence of
$E|U_n|^{s+2} < \infty$.

(ii) Lemma 2.1 of [1] proves that (4.15) holds uniformly over
all indicator functions $f$. For an arbitrary bounded $f$, approximate
$f$ by a finite linear combination of indicators $f_{j,n}$ such that

|f(x)  -  1  dj  n  fJ  a(x)|  <  I-"  for  aU  x  . 

Then, letting $\Delta(f)$ be the difference between the two integrals
in (4.15), and using the Hahn decomposition on the signed measure
$\Delta(\cdot)$, shows that

|««|  <  *"*(/  +  !  IVn(j,)ldj,)  +  l“j,. 

<  <X2~°>  +  «ip|f(«)|  .  , 

where the uniformity over indicators is used in the final step. ∎

Our next theorem shows that Theorem 4.8 continues to hold, in
expectation, when the density assumption on $U$ is deleted. We impose
rather strong assumptions on $U$, and the class of test functions $f$
allowed, in order to simplify the exposition.

(4.16) THEOREM. Suppose that $U_n$ has finite moments of all
orders. Then, for any $s$,

(4.17)  $$Ef(A_n) = \int f(y)\,\psi_{s,n}(y)\,dy + o(n^{-s/2})$$

for all $f \in C^*(R)$. The function $\psi_{s,n}$ can be identified
through the "delta method."

Proof. First, observe that for any $\epsilon > 0$, there exists
$K > 0$ such that

(4.18)  $$P\{|\bar U_n - \mu| > \epsilon\} \le P\{|\tilde U_n| > K\ln n\}.$$

The probability on the right-hand side of (4.18) is $o(n^{-s/2})$ (see
Corollary 17.12 of [2]), and hence

$$Ef(A_n) = E\{f(A_n);\ |\bar U_n - \mu| \le \epsilon\} + o(n^{-s/2}).$$

Choose $\epsilon$ sufficiently small that $D^{(s+2-k)}H_k(u)$ is
continuous on an $\epsilon$-neighborhood of $\mu$ for all $k$.
Expanding $A_n$ on $\{|\bar U_n - \mu| \le \epsilon\}$ yields
yields 

(4.19)  $$A_n = A_{s,n} + n^{-(s+1)/2}\sum_{k=0}^{s} c_n\,(\tilde U_n\cdot\nabla)^{s+2-k}H_k(\eta_{k,n})/(s+2-k)!$$

where $\nabla = (D_1, \ldots, D_k)$ and
$|\eta_{k,n} - \mu| \le \epsilon$. Note that the derivatives of $H_k$
evaluated at $\eta_{k,n}$ are bounded r.v.'s in (4.19).

Thus, we can write $f(A_n)$ on $\{|\bar U_n - \mu| \le \epsilon\}$ as


f(A)  -  l  (a  -  P  (U  ))k  (Dkf)  (Pa  (0  ))/k! 
n  fcaiO  ^  otn  n  o,n  n 


♦  <a„  -  p«  •<®«))"fl  a>*+1«)(tij/(»+D« 

u  Ota  a  u 


which evidently can be re-written as


f(*a) 


Jt 

-,1c 


,-J/2 


r.  (U; 
j  *®  o 


\,n{  V 


For $j \le s$, $r_{j,n}$ has the form
$\tilde U_n^\alpha\,g(\rho\cdot\tilde U_n)$ ($g \in C^*(R)$), whereas
for $j > s$, $r_{j,n}$ is the product of functions of this form with
bounded functions. Thus, for $j \le s$,

$$E\{r_{j,n}(\tilde U_n);\ |\bar U_n - \mu| \le \epsilon\} = Er_{j,n}(\tilde U_n) + o(n^{-s/2}) = \int r_{j,n}(u)\,\xi_{s,n}(u)\,du + o(n^{-s/2}),$$

the first equality by uniform integrability of $\{r_{j,n}\}$, the
second by Proposition 4.13(i). A similar argument for $j > s$ shows
that

$$E\Big\{\sum_{j \ge s+1} n^{-j/2}\,r_{j,n};\ |\bar U_n - \mu| \le \epsilon\Big\} = o(n^{-s/2})$$

and hence

$$\begin{aligned}
Ef(A_n) &= \int \sum_{j=0}^{s} n^{-j/2}\,r_{j,n}(u)\,\xi_{s,n}(u)\,du + o(n^{-s/2}) \\
&= \int f(A_{s,n}(u))\,\xi_{s,n}(u)\,du + o(n^{-s/2}).
\end{aligned}$$

Applying Proposition 4.13(ii) completes the proof of (4.17). The
identification of $\psi_{s,n}(y)$ as that derived from the "delta
method" follows from a proof identical to that found on pages 445-6 of
[1]. ∎

We remark that an immediate consequence of Theorem 4.16 is that,
under the assumptions stated, the characteristic function of $A_n$
can be expanded as

$$E\exp(itA_n) = \hat\psi_{s,n}(it) + o(n^{-s/2}).$$


5.  Edgeworth  Expansions  for  Ratio  Estimator  Pivots 

As Section 4 illustrates, the key to obtaining Edgeworth
expansions is the calculation of cumulants (see (4.4)) of the
differential approximation $A_{s,n}$. The required moments will be
derived from Proposition 4.12. In this section, we will calculate
Edgeworth expansions for $t_n$ and $t_n^*$ to order $n^{-1}$. This
represents a different approach from that of GEARY (1947) and GAYEN
(1949), who formally expanded the distribution of the pivot $t_n$ (for
the classical case where $\tau_n \equiv 1$) in Charlier-type series.

We start by observing that the pivotal quantities $t_n$ and $t_n^*$
are invariant to the transformation
$(Y_i, \tau_i) \to (aY_i, a\tau_i) \equiv (Y_i', \tau_i')$ for $a > 0$.
In particular, by taking $a = 1/\sigma(Z)$, we can assume throughout
our calculations, via a passage to $(Y_i', \tau_i')$, that
$\sigma(Z) = 1$. However, in stating our final conclusions, dependence
on $\sigma(Z)$ will be made explicit.

Our first order of business is to expand $t_n$ and $t_n^*$ in
differential-type approximations $t_{2,n}$ and $t_{2,n}^*$. Let
$V_i = (Y_i, \tau_i)$ and let $f_1(v) = v_1$, $f_2(v) = v_2$,
$f_3(v) = v_1^2$, $f_4(v) = v_1v_2$, $f_5(v) = v_2^2$. Observe that
$r_n - r = \bar Z_n/\bar\tau_n$ and that
$I\{v_n > 0,\ \bar\tau_n \ne 0\}$ is identically 1 on a neighborhood of
$\mu$. Thus,


S.t  Vk  -  (V 


'*  -  *„(>  <v»  + 1  <V‘>2  +  V*’1’ ) 

where 


(5.2) 


2 


%1-1 


♦  ^  <  v*> 


and $W_i = Z_i^2 - 1$. Expanding $1/\bar\tau_n$ in a Taylor series
about $1/E\tau$, we obtain

(5.3)


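(5.6) THEOREM. (i) Let $\{A_n;\ n \ge 0\}$ be a sequence of i.i.d.
r.v.'s with $E|A_n|^k < \infty$ for all $k$. Then,

(a)  $E\tilde A_n^3 = n^{-1/2}\,E\tilde A_1^3$

(b)  $E\tilde A_n^4 = 3(E\tilde A_1^2)^2 + n^{-1}E\tilde A_1^4 - 3n^{-1}(E\tilde A_1^2)^2$

(c)  $E\tilde A_n^5 = 10\,n^{-1/2}(E\tilde A_1^2)(E\tilde A_1^3) + O(n^{-1})$

(d)  $E\tilde A_n^6 = 15(E\tilde A_1^2)^3 + O(n^{-1/2})$

(e)  $E\tilde A_n^k = O(n^{-1/2})$ if $k$ is odd.

(ii) Suppose that $\{(A_n, B_n);\ n \ge 1\}$ is a sequence of i.i.d.
r.v.'s with $E|A_n|^k < \infty$, $E|B_n|^k < \infty$ for all $k$. Then,

(f)  $E\tilde A_n\tilde B_n = E\tilde A_1\tilde B_1$

(g)  $E\tilde A_n^3\tilde B_n = 3(E\tilde A_1\tilde B_1)(E\tilde A_1^2) + O(n^{-1/2})$

(h)  $E\tilde A_n^4\tilde B_n = 4n^{-1/2}(E\tilde A_1^3)(E\tilde A_1\tilde B_1) + 6n^{-1/2}(E\tilde A_1^2\tilde B_1)(E\tilde A_1^2) + O(n^{-1})$

(i)  $E\tilde A_n^4\tilde B_n^2 = 3(E\tilde A_1^2)^2(E\tilde B_1^2) + 12(E\tilde A_1\tilde B_1)^2(E\tilde A_1^2) + O(n^{-1/2})$

(j)  $E\tilde A_n^5\tilde B_n = 15(E\tilde A_1\tilde B_1)(E\tilde A_1^2)^2 + O(n^{-1/2})$

(k)  $E\tilde A_n^k\tilde B_n^j = O(n^{-1/2})$ if $k+j$ is odd.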

n  n 

Proof. For (i), observe that exchangeability of the sequence
$\{A_n:\ n \ge 0\}$ provides a recursion in $k$, namely

$$\begin{aligned}
E\Big(\sum_{i=1}^n A_i'\Big)^{k+1}
&= E\Big(\sum_{j=1}^n A_j'\Big)\Big(\sum_{i=1}^n A_i'\Big)^{k} \\
&= n\,E\Big(A_1'\Big(\sum_{i=1}^n A_i'\Big)^{k}\Big) \\
&= n\sum_{i=1}^{k}\binom{k}{i}\,E(A_1')^{i+1}\,E\Big(\sum_{j=2}^{n} A_j'\Big)^{k-i}
\end{aligned}$$

where $A_i' = A_i - EA_i$. Solving the recursion with initial condition
$EA_1' = 0$ proves (i). For (ii), apply (i) to
$A_n(s,t) = sA_n + tB_n$. Both sides of equations (a) through (e) are
then polynomials in $(s,t)$. Identifying coefficients yields results
(f) through (k). ∎
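
The moment identities above can be checked exactly on a small discrete distribution. The sketch below is only an illustration: the helper name `centered_sum_moment` is hypothetical, and it verifies the standard identities $E(\sum A_i')^3 = nE(A_1')^3$ and $E(\sum A_i')^4 = nE(A_1')^4 + 3n(n-1)(E(A_1')^2)^2$, which are the unnormalized counterparts of the third- and fourth-moment formulas, by brute-force enumeration with rational arithmetic.

```python
from fractions import Fraction
from itertools import product


def centered_sum_moment(values, probs, n, k):
    """Exact E[(A_1' + ... + A_n')^k] for i.i.d. draws from a finite distribution.

    values: support points; probs: matching probabilities given as Fractions.
    Each A_i' is the draw minus the distribution mean. Enumeration costs
    len(values) ** n terms, so this is intended for tiny n only.
    """
    mean = sum(Fraction(v) * p for v, p in zip(values, probs))
    centered = [Fraction(v) - mean for v in values]
    total = Fraction(0)
    for idx in product(range(len(values)), repeat=n):
        weight = Fraction(1)
        s = Fraction(0)
        for i in idx:
            weight *= probs[i]
            s += centered[i]
        total += weight * s ** k
    return total
```

With support $\{-1, 2\}$ and probabilities $2/3, 1/3$ (mean zero, second, third and fourth central moments 2, 2 and 6) and $n = 3$, the enumeration can be compared with $n \cdot 2$ for the third moment and $n \cdot 6 + 3n(n-1) \cdot 4$ for the fourth.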

Now, let $\beta = EZ_1^3/\sigma^3(Z_1)$,
$\lambda = (EZ_1^4/\sigma^4(Z_1)) - 3$, and
$\gamma = EZ_1^2\tau_1/(\sigma^2(Z_1)E\tau_1)$. Then,

(5.7)  $$\begin{aligned}
EZ_1W_1 &= \beta, & EZ_1^2W_1 &= \lambda + 2, \\
EZ_1Q_1 &= \delta + 2\alpha^2 - 2\gamma, & EW_1^2 &= \lambda + 2.
\end{aligned}$$

In view of Proposition 4.12, (5.5) through (5.7) provide the
asymptotic expansion

$$Et_{2,n}^* = (\alpha - \beta/2)\,n^{-1/2} + o(n^{-1}).$$

Similar reasoning on the higher moments proves that

$$\begin{aligned}
E(t_{2,n}^*)^2 &= 1 + (6\alpha^2 + 6\gamma + 2\beta^2 - 3\delta - 10\alpha\beta)\,n^{-1} + o(n^{-1}) \\
E(t_{2,n}^*)^3 &= (9\alpha - 7\beta/2)\,n^{-1/2} + o(n^{-1}) \\
E(t_{2,n}^*)^4 &= 3 + (120\alpha^2 + 60\gamma + 28\beta^2 - 30\delta - 140\alpha\beta - 2\lambda - 6)\,n^{-1} + o(n^{-1}).
\end{aligned}$$

The relevant expansions for $t_n$ may be easily obtained by
using the relation $t_n = t_n^*(1 - 1/2n + O(n^{-2}))$. Consequently,

up to terms of order $o(n^{-1})$.

Note that in the classical case where $\tau_n \equiv 1$, we have
$\alpha = 0$ and $\gamma = \delta = 1$. The moment formulas for $t_n$,
when appropriately simplified, are then in agreement with those found
in [11] and [12]. It should also be noted that in the classical case,
the approximate skewness $E(t_{2,n})^3$ ($E(t_{2,n}^*)^3$) of $t_n$
($t_n^*$) is $-7\beta/2$. This verifies the empirical observation that
positive skewness in the distribution of $Y_i$ leads to negative
skewness in the pivots $t_n$ and $t_n^*$ (see SOPHISTER (1928), and
NEYMAN and PEARSON (1928)).

The appropriate cumulants $\kappa_{j,n}$ of $t_n$ are given by

(5.8)  $$\kappa_{j,n} = \sum_{k=0}^{2} b_{j,k}\,n^{-k/2} + o(n^{-1})$$

where

$$\begin{aligned}
b_{1,1} &= \alpha - \beta/2 \\
b_{2,0} &= 1 \\
b_{2,2} &= 6\gamma + 5\alpha^2 + 7\beta^2/4 - 3\delta - 9\alpha\beta - 1 \\
b_{3,1} &= 6\alpha - 2\beta \\
b_{4,2} &= 24\gamma + 60\alpha^2 + 12\beta^2 - 12\delta - 60\alpha\beta - 2\lambda - 6
\end{aligned}$$

and all other $b_{j,k}$ are zero. For $t_n^*$, the $b_{j,k}^*$ of the
corresponding cumulants $\kappa_{j,n}^*$ satisfy
$b_{j,k}^* = b_{j,k}$, excepting that $b_{2,2}^* = b_{2,2} + 1$.
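
The cumulant coefficients can be collected in a small helper, sketched below. The function name and argument names are hypothetical. One caveat is flagged in the code: the coefficient $6\gamma$ in $b_{2,2}$ is not legible in the source and is inferred here from the second-moment expansion of $t_{2,n}^*$ (second moment minus squared first moment, less 1 for the passage from $t_n^*$ to $t_n$), so it should be treated as an assumption.

```python
def pivot_cumulant_coefficients(alpha, beta, gamma, delta, lam, starred=False):
    """Coefficients b_{j,k} of the cumulant expansions for t_n (or t_n* if starred).

    alpha, beta, gamma, delta, lam are the standardized moment parameters of
    Section 5 (lam stands for lambda). Returns a dict keyed by (j, k).
    """
    b11 = alpha - beta / 2
    # Assumption: the leading 6*gamma term is reconstructed, not read from the source.
    b22 = (6 * gamma + 5 * alpha ** 2 + 7 * beta ** 2 / 4
           - 3 * delta - 9 * alpha * beta - 1)
    if starred:
        b22 += 1  # b*_{2,2} = b_{2,2} + 1; all other coefficients are unchanged
    b31 = 6 * alpha - 2 * beta
    b42 = (24 * gamma + 60 * alpha ** 2 + 12 * beta ** 2
           - 12 * delta - 60 * alpha * beta - 2 * lam - 6)
    return {(1, 1): b11, (2, 0): 1, (2, 2): b22, (3, 1): b31, (4, 2): b42}
```

In the classical case with symmetric, normal-like data (`alpha=0, beta=0, gamma=1, delta=1, lam=0`) this gives $b_{2,2} = 2$ and $b_{4,2} = 6$, which is consistent with the familiar large-sample variance and kurtosis corrections for Student's t.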

The following distribution function $\Psi_{2,n}(x)$ is obtained from
$\kappa_{j,n}$ in the same way as passage was made from (4.4) to (4.7):

(5.9)  $$\begin{aligned}
\Psi_{2,n}(x) = \Phi(x) &- b_{1,1}\,\phi(x)/n^{1/2} \\
&- (b_{2,2} + b_{1,1}^2)\,x\,\phi(x)/2n \\
&+ b_{3,1}(1 - x^2)\,\phi(x)/6n^{1/2} \\
&- (b_{4,2} + 4b_{1,1}b_{3,1})(x^3 - 3x)\,\phi(x)/24n \\
&- b_{3,1}^2\,(x^5 - 10x^3 + 15x)\,\phi(x)/72n
\end{aligned}$$
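
A direct transcription of the two-term expansion (5.9) into code may help when evaluating it numerically. The sketch below is illustrative only; the function name `edgeworth_cdf` is hypothetical and the coefficients $b_{j,k}$ are passed in by the caller.

```python
import math


def edgeworth_cdf(x, n, b11, b22, b31, b42):
    """Evaluate the second-order Edgeworth approximation of the form (5.9) at x."""
    phi = math.exp(-x * x / 2) / math.sqrt(2 * math.pi)   # standard normal density
    Phi = 0.5 * (1 + math.erf(x / math.sqrt(2)))          # standard normal cdf
    root_n = math.sqrt(n)
    return (Phi
            - b11 * phi / root_n
            - (b22 + b11 ** 2) * x * phi / (2 * n)
            + b31 * (1 - x * x) * phi / (6 * root_n)
            - (b42 + 4 * b11 * b31) * (x ** 3 - 3 * x) * phi / (24 * n)
            - b31 ** 2 * (x ** 5 - 10 * x ** 3 + 15 * x) * phi / (72 * n))
```

When $b_{1,1} = b_{3,1} = 0$ only odd functions of $x$ multiply $\phi(x)$, so the approximation is symmetric in the sense that the values at $x$ and $-x$ sum to one. Note that an Edgeworth approximation need not be monotone or confined to $[0, 1]$ for small $n$.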


Also, let $\Psi_{2,n}^*(x)$ be the function obtained from (5.9) by
substituting $b_{j,k}^*$ in place of $b_{j,k}$.

(5.10) THEOREM. (i) If $E(Y_n^6 + \tau_n^6) < \infty$, then

$$P\{t_n \le x\} = \Phi(x) + O(n^{-1/2})$$

where $O(n^{-1/2})$ is uniform in $x$.

(ii) Suppose that $E(|Y_n|^k + |\tau_n|^k) < \infty$ for all $k$.
Then,

$$Ef(t_n) = \int f(y)\,\Psi_{2,n}(dy) + o(n^{-1})$$

for all $f \in C^*(R)$.

(iii) Suppose $E(Y_n^8 + \tau_n^8) < \infty$, and that
$(Y_1, \tau_1)$ has a distribution with a Lebesgue density component
which is positive on some open set in the plane. Then,

$$P\{t_n \in B\} = \int_B \Psi_{2,n}(dy) + o(n^{-1})$$

where $o(n^{-1})$ is uniform over all Borel sets $B$.

(iv) Suppose $\tau_n \equiv 1$. Then, if $EY_n^8 < \infty$, and if
the distribution of $Y_n$ has a Lebesgue density component which is
positive on some interval, the analogue of (iii) above holds. The
function $\Psi_{2,n}$ is obtained from (5.9) by formal substitution.

(v) Results (i) to (iv) are valid for $t_n^*$ under the same
assumptions as for $t_n$, provided that $\Psi_{2,n}^*$ is substituted
in place of $\Psi_{2,n}$.

Proof. The functions $f_1, \ldots, f_5$, being distinct polynomials,
are linearly independent so Theorems 4.8 and 4.16 can be applied,
yielding (i) to (iii). Part (iv) is handled as a special case, by
setting $V_i = Y_i$, and $f_1(v) = v$, $f_2(v) = v^2$, and applying the
same argument as for (iii). ∎

As previously mentioned, a particularly important application of
ratio estimation lies in the domain of ergodic analysis of
regenerative stochastic processes. It frequently occurs that the
regenerative sequence $\{(Y_i, \tau_i);\ i \ge 1\}$ constructed is such
that $Y_i$ has a Lebesgue density component, whereas $\tau_i$ is a
lattice r.v. For example, this is the case that arises when
$\{X_t;\ t \ge 0\}$ is a continuous time process constructed from a
discrete time regenerative process $\{X_n\}$ via the formula

x‘ "  i, r<»<.wo  • 

Our next result addresses this class of processes.

(5.11) THEOREM. Suppose that $E(Y_n^6 + \tau_n^6) < \infty$ and that
$Z_1$ has a distribution with a Lebesgue density component which is
positive on an interval. Then,

$$P\{t_n \le x\} = \Psi_{1,n}(x) + o(n^{-1/2})$$

$$P\{t_n^* \le x\} = \Psi_{1,n}(x) + o(n^{-1/2})$$

uniformly in $x$, where $\Psi_{1,n}(x)$ is obtained from
$\Psi_{2,n}(x)$ by deleting terms with coefficient $n^{-1}$.

Proof. The pivot $t_n$ can be expanded as


‘  *  *2 

«-12>  V'lOO.VOlIVyii*?!1  (,  *«i))  +  Io/6, 


where  0(l/n)  is  deterministic  end 


*»  -  ^O.yOl'V''’  ««•> 

and $H$ corresponds to $t_n$ via (4.1). Now, observe that Theorem
4.8(ii), with $s = 1$, is applicable to the first term in (5.12).
Select $\epsilon$ small enough so that $(D^\nu H)(u)$ is bounded for
$|u - \mu| \le \epsilon$ for all multiindices $\nu$ with $|\nu| = 3$.
Let $\rho_n = K(\ln n/n)^{1/2}$ for $K$ to be chosen later. Then,

P{lXnl  >  pn  n1/2} 

<  P(lxn»  >  Pn  n1/2;  «5n-ul  <  e>  +  P{IOn-m  >  e) 

<  *Oxn*  >  etH  in  n)  +  ?{ICnl  >  K  In  n)  . 

Choose $K$ sufficiently small so that (4.18) applies. Thus,

$$P\{|X_n| > \rho_n n^{1/2}\} = o(n^{-1/2}),$$

and hence Theorem 4.11 implies the result for $t_n$. Precisely the same
argument works for $t_n^*$. ∎


6. Applications to Ratio Estimator Confidence Interval Estimation

In this section, we apply the Edgeworth expansions of Section 5 to
analysis of nonparametric ratio estimator confidence intervals.

(6.1) THEOREM. (i) Suppose $E(Y_n^6 + \tau_n^6) < \infty$. Then,
$e_n^{\ell}(p)$, $e_n^{r}(p)$, and $e_n(p)$ are all $O(n^{-1/2})$,
uniformly in $p$.

(ii) Under the assumptions of Theorem 5.11,

$$\begin{aligned}
e_n^{\ell}(p) &= p + 1 - \alpha - \Psi_{1,n}(z(p+1-\alpha)) + o(n^{-1/2}) \\
e_n^{r}(p) &= \Psi_{1,n}(z(p)) - p + o(n^{-1/2}) \\
e_n(p) &= \Psi_{1,n}(z(p+1-\alpha)) - \Psi_{1,n}(z(p)) - (1-\alpha) + o(n^{-1/2})
\end{aligned}$$

uniformly in $p$.

(iii) Under either assumptions (iii) or (iv) of Theorem 5.10,

$$\begin{aligned}
e_n^{\ell}(p) &= p + 1 - \alpha - \Psi_{2,n}(z(p+1-\alpha)) + o(n^{-1}) \\
e_n^{r}(p) &= \Psi_{2,n}(z(p)) - p + o(n^{-1}) \\
e_n(p) &= \Psi_{2,n}(z(p+1-\alpha)) - \Psi_{2,n}(z(p)) - (1-\alpha) + o(n^{-1})
\end{aligned}$$

uniformly in $p$.

(iv) Results (i) to (iii) are valid for $e_n^{\ell}(p)^*$,
$e_n^{r}(p)^*$, and $e_n(p)^*$ under the same assumptions as for the
$t_n$ errors, provided $\Psi_{i,n}^*$ is substituted in place of
$\Psi_{i,n}$.

Proof. The results follow immediately from (3.5), and Theorems 5.10
and 5.11. ∎

These expressions show that under reasonable assumptions the coverage errors $e_n(p)$, $e_n(p)^*$ are $O(n^{-1/2})$ for $p \ne \alpha/2$, whereas for $p = \alpha/2$, the coverage errors are $O(n^{-1})$. Thus, using confidence intervals based on $p = \alpha/2$ leads to intervals that are asymptotically optimal in the sense of having shortest possible length and most accurate coverage rate. However, it is important to realise that the one-sided coverage errors are $O(n^{-1/2})$ for all $p$, including $p = \alpha/2$. Hence it must be that $[L_n(\alpha/2), R_n(\alpha/2)]$ (similarly for $[L_n^*(\alpha/2), R_n^*(\alpha/2)]$) achieves coverage error of $O(n^{-1})$ via cancellation of one-sided errors of order $O(n^{-1/2})$. This suggests that a "corrected interval", in the sense of one-sided error, can be obtained by shifting the interval slightly. This is in agreement with parametric confidence interval theory, where intervals tend to be asymmetric about the point estimate. We shall examine this question further in Section 7.
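The cancellation at $p = \alpha/2$ can be made explicit with a short calculation. The sketch below assumes, as is standard for first-order Edgeworth expansions but not restated in this section, that $\Psi_{1,n}(x) = \Phi(x) + n^{-1/2} q(x) \phi(x)$ with $q$ an even polynomial; the symbol $q$ is introduced here purely for illustration.

```latex
\begin{align*}
e_n(p) &= \Psi_{1,n}(z(p+1-\alpha)) - \Psi_{1,n}(z(p)) - (1-\alpha) + o(n^{-1/2}) \\
       &= n^{-1/2}\bigl[\, q(z(p+1-\alpha))\,\phi(z(p+1-\alpha)) - q(z(p))\,\phi(z(p)) \,\bigr] + o(n^{-1/2}),
\end{align*}
% using \Phi(z(p+1-\alpha)) - \Phi(z(p)) = 1 - \alpha.
% At p = \alpha/2 we have z(1-\alpha/2) = -z(\alpha/2); since q and \phi are both even,
% the bracket vanishes and only terms of smaller order remain.
```

Each one-sided error retains its own $n^{-1/2} q \phi$ term, which is why the one-sided errors stay of order $n^{-1/2}$ even when the two-sided error does not.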

The coverage errors $e_n(\alpha/2)$ and $e_n(\alpha/2)^*$ are given by

*a<«/2>  *  "<b2.2  +  bl,l)x« 

-(b*,2  +  “l.l  b3,l>  ♦<S>/12” 

-b3,l<‘«  '  “>««  +  **■«> 

$$e_n(\alpha/2)^* = e_n(\alpha/2) + x_\alpha \phi(x_\alpha)/n$$

where $x_\alpha = z(1 - \alpha/2)$. Recalling the definition of the $b_{j,i}$'s (see (5.7)), we see that $e_n(\alpha/2)$ and $e_n(\alpha/2)^*$ have a tendency to be negative, particularly if the $Z_i$'s are highly skewed (i.e., $\mu^2$ is large). This tendency for nonparametric confidence intervals to undercover has been exhibited empirically; see IGLEHART (1975), for example. The procedure of Section 8 will attempt to deal with this coverage rate problem.

Note that the $t_n^*$ coverage error is always biased upwards from that of $t_n$ by an amount $z(1-\alpha/2)\, \phi(z(1-\alpha/2))/n$. This is an attractive property of $t_n^*$, in comparison to $t_n$, in view of the undercoverage mentioned above. The cost associated with using $t_n^*$, rather than $t_n$, is that the $t_n^*$ interval is longer, asymptotically, by an amount $\sigma(Z)\, z(1-\alpha/2)/((E\tau)\, n^{3/2})$.

A similar analysis can be performed for the intervals $[L_n'(\alpha/2), R_n'(\alpha/2)]$ based on Student t-quantiles. PEISER (1943) showed that

(6.2)  $z_{n-1}(p) = z(p) + (z^3(p) + z(p))/4n + o(n^{-1})$.

Thus, using the uniformity in $x$ of the expansion $\Psi_{2,n}(x)$, we get

(6.3)  $e_n(\alpha/2)' = \Psi_{2,n}(z_{n-1}(1-\alpha/2)) - \Psi_{2,n}(z_{n-1}(\alpha/2)) - (1-\alpha) + o(n^{-1})$
       $= e_n(\alpha/2) + (x_\alpha^3 + x_\alpha)\, \phi(x_\alpha)/2n + o(n^{-1})$

where $x_\alpha = z(1-\alpha/2)$. Thus, the coverage rate for the interval $[L_n'(\alpha/2), R_n'(\alpha/2)]$ tends to be larger than that of $t_n$, by an amount $(x_\alpha^3 + x_\alpha)\, \phi(x_\alpha)/2n$. For highly skewed populations, this gives intervals based on Student t-quantiles an advantage over those based on normal quantiles. The use of Student t-quantiles comes at the cost of an interval which is longer by an amount $\sigma(Z)(x_\alpha^3 + x_\alpha)/((E\tau)\, n^{3/2})$, however.

Note that for samples from populations with normal $Y_i$ and $\tau_i \equiv 1$, $e_n(\alpha/2)' = o(n^{-1})$, as expected.
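To give a feel for the size of these corrections, the following minimal sketch evaluates the quantile approximation (6.2) and the two coverage shifts $x_\alpha \phi(x_\alpha)/n$ and $(x_\alpha^3 + x_\alpha) \phi(x_\alpha)/2n$ for a given $\alpha$ and $n$. The function name `coverage_shifts` and the dictionary keys are illustrative choices, not notation from the text, and the $o(n^{-1})$ remainders are simply dropped.

```python
from statistics import NormalDist


def coverage_shifts(alpha: float, n: int) -> dict:
    """First-order corrections discussed in Section 6 (remainder terms dropped)."""
    std_normal = NormalDist()
    x = std_normal.inv_cdf(1 - alpha / 2)   # x_alpha = z(1 - alpha/2)
    phi = std_normal.pdf(x)                 # standard normal density at x_alpha
    return {
        "x_alpha": x,
        # Approximation (6.2) to the Student t quantile with n - 1 degrees of freedom.
        "t_quantile_approx": x + (x ** 3 + x) / (4 * n),
        # Upward shift in two-sided coverage from using t_n^* instead of t_n.
        "shift_star": x * phi / n,
        # Upward shift in two-sided coverage from using Student t quantiles, see (6.3).
        "shift_student": (x ** 3 + x) * phi / (2 * n),
    }


if __name__ == "__main__":
    for key, value in coverage_shifts(alpha=0.10, n=20).items():
        print(f"{key}: {value:.4f}")
```

For a 90% interval, $x_\alpha$ exceeds 1, so the Student t shift is the larger of the two.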

7.  Johnson's  Pivotal  Transformation 

In Section 6, it was shown that under reasonably general assumptions, the one-sided coverage errors for $t_n$ and $t_n^*$ are of order $O(n^{-1/2})$. These errors arise due to asymmetry effects related to skewness and ratio estimator bias. In a recent paper JOHNSON (1978) considered, in the case where $\tau_n \equiv 1$, a transformation of the pivot $t_n$ derived on the basis of Cornish-Fisher expansions (see CORNISH and FISHER (1937) for a discussion of these expansions). Empirical evidence collected by Johnson indicated that the transformation led to intervals that reflected the "correct" degree of asymmetry. We now investigate the pivotal transformation of Johnson, using the machinery developed in Section 4.

Consider the sequence

(7.1)  $T_n = t_n + \beta_n n^{-1/2} + \rho_n (t_n)^2 n^{-1/2}$

where $\beta_n = \beta(\bar U_n, v_n)$, $\rho_n = \rho(\bar U_n, v_n)$, and $\beta(\cdot)$, $\rho(\cdot)$ are functions analytic on a neighborhood of $(\mu, \sigma^2(Z))$. Let $\beta = \beta(\mu, \sigma^2(Z))$, $\rho = \rho(\mu, \sigma^2(Z))$, and observe that

(7.2)  $T_n = Z_n + (aZ_n^2 - W_n Z_n/2 + \beta + \rho Z_n^2)\, n^{-1/2} + n^{-1} \chi_n = T_{1,n} + n^{-1} \chi_n$

where $\chi_n = O_p(1)$.

We now use Proposition 4.12, Theorem 5.6, and relation (5.7) to obtain the cumulant expressions

$$\kappa_1(T_{1,n}) = (-\mu/2 + a + \beta + \rho)\, n^{-1/2} + O(n^{-1})$$
$$\kappa_2(T_{1,n}) = 1 + O(n^{-1})$$
$$\kappa_3(T_{1,n}) = (-2\mu + 6a + 6\rho)\, n^{-1/2} + O(n^{-1}).$$

Observe that by setting $\beta = \mu/6$, $\rho = \mu/3 - a$, the $n^{-1/2}$ terms vanish, so that all three cumulants above agree with those of a standard normal r.v. up to $O(n^{-1})$. This suggests setting

(7.3)  $\beta_n = \mu_n/6$,  $\rho_n = \mu_n/3 - a_n$.

(7.4) THEOREM. (i) [...]

$$Ef(T_n) = \int f(y)\, \phi(y)\, dy + o(n^{-1/2})$$

for all $f \in \mathcal{L}(t)$.

(ii) If $E(|Y_n|^9 + |\tau_n|^9) < \infty$, and if the density assumption of Theorem 5.11 holds, then

$$P\{T_n \le x\} = \Phi(x) + o(n^{-1/2})$$

uniformly in $x$.

(iii) Results (i) and (ii), under the assumptions stated, are valid for $T_n^*$.


Proof. First, observe that $\chi_n$ (see (7.2)) is the remainder term from Taylor's theorem for $T_n$. On the set $\{v_n > 0, \bar\tau_n \ne 0\}$, $\chi_n$ takes the form


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*„  *  <V'>3 

where $H_n(\xi_n)$ is bounded on $\{\ldots < \epsilon\}$. Now, apply Theorems 4.8(ii) and 4.16, as in the proof of Theorem 5.11, to obtain (ii).

For (i), write

$$Ef(T_n) = Ef(T_{1,n}) + E\chi_n Df(\eta_n)/n + o(n^{-1/2})$$

and argue as in the proof of Theorem 4.16. The proofs for $T_n^*$ can be handled similarly. ∎

We remark that the moment assumptions in Theorem 7.4 follow from the fact that $U_n$ must be expanded to include $Y_n^k \tau_n^j$ with $k+j = 3$, due to the presence of $\beta_n$ in $T_n$.

For the classical case where $\tau_n \equiv 1$, the transformed pivots $T_n$ and $T_n^*$ are precisely the statistics suggested by Johnson, up to a term which is $O_p(n^{-1})$. Note that Theorem 7.4 gives rigorous substance to the statement that $T_n$ ($T_n^*$) "normalises" $t_n$ ($t_n^*$) in the sense of creating a r.v. which is closer to a normal. This is not surprising, in light of the fact that Johnson's calculations were based on Cornish-Fisher expansions, which are "normalisation" series (see [30], p. 643).

Theorem 7.4 can be easily applied to coverage error asymptotics to yield the following result: if $(Y_n, \tau_n)$ satisfies the assumptions of Theorem 7.4(ii), then all the coverage errors (one-sided as well as two-sided) for intervals based on $T_n$ or $T_n^*$ are $o(n^{-1/2})$ uniformly in the parameter $p$. Thus, the Johnson pivotal transformation corrects for asymmetry effects.
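As an illustration of (7.1) with the choices $\beta_n = \mu_n/6$, $\rho_n = \mu_n/3 - a_n$, the following minimal sketch computes $t_n$ and the transformed pivot $T_n$ in the classical case $\tau_n \equiv 1$, where the bias term $a_n$ is taken to be zero. The function name `johnson_pivot`, the use of divisor $n$ in the sample moments, and the moment-ratio skewness estimate are assumptions made for the example, not definitions taken from the paper.

```python
import math


def johnson_pivot(y, r):
    """Return (t_n, T_n) for an i.i.d. sample y and hypothesised mean r (tau_n = 1)."""
    n = len(y)
    mean = sum(y) / n
    m2 = sum((v - mean) ** 2 for v in y) / n   # sample variance (divisor n, an assumption)
    m3 = sum((v - mean) ** 3 for v in y) / n   # third central sample moment
    if m2 <= 0.0:
        raise ValueError("sample variance must be positive")
    s = math.sqrt(m2)
    t = math.sqrt(n) * (mean - r) / s          # untransformed pivot t_n
    skew = m3 / s ** 3                         # sample skewness, playing the role of mu_n
    beta_n = skew / 6.0
    rho_n = skew / 3.0                         # a_n = 0 when tau_n = 1
    big_t = t + beta_n / math.sqrt(n) + rho_n * t * t / math.sqrt(n)
    return t, big_t


if __name__ == "__main__":
    sample = [0.1, 0.3, 0.2, 2.5, 0.4, 0.9, 0.05, 1.7]
    print(johnson_pivot(sample, r=1.0))
```

For a sample with zero sample skewness the two pivots coincide; for right-skewed data the correction pushes the pivot upwards, which is the direction needed to offset the negative first cumulant of $t_n$.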


8.  A  Second-Order  Pivotal  Transformation 

As discussed in Section 6, nonparametric confidence intervals have a tendency to undercover at small sample sizes. However, the analysis of the symmetric intervals $[L_n(\alpha/2), R_n(\alpha/2)]$ showed that the coverage error is basically determined by the term in $n^{-1}$ of the asymptotic expansion of $P\{t_n \le x\}$. This suggests that any attempt to correct the coverage rate of the symmetric intervals $[L_n(\alpha/2), R_n(\alpha/2)]$ must deal with higher order error terms than those considered by the Johnson pivotal transformation.

Consider the statistic

(8.1)  $\hat T_n = T_n + \nu_n t_n n^{-1} + \omega_n t_n^3 n^{-1}$

where $\nu_n = \nu(\bar U_n) + O_p(n^{-3/2})$, $\omega_n = \omega(\bar U_n) + O_p(n^{-3/2})$, and $\nu(\cdot)$, $\omega(\cdot)$ are functions analytic on a neighborhood of $\mu$. Before proceeding with an expansion of $\hat T_n$, we state the following approximations:


(8.2) 


-  \  -  35nY/o(Z)  +  Op(n_3/2) 
«  -  *  -  Z  5/o(Z)  +  <>  (n"3/2> 

a  a  Q  p 
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where  •  Z3/^3^),  »  Z^t^/CaCZ)  E-t).  Substituting  (8.2)  into 

(8.1) yields the following expansion (for the purposes of calculation, we take $\sigma(Z) = 1$):


(8.3)  T  -  {Z  (1  -  \  (V  -  2aZ  +  Z  Q  n"1/2)  n"l/Z 
n  »  n  z  n  n  n  n 

*j  (WB  -  2«V2  n”1}  (1-n”1) 


•1/2  *  ?  A  A  .1/8  .1/8  A 

+  \n  +  pZ*(l  -  (W  -2aZ  )n  1/z)n  i/z  +  M  /6n  -  Z  y/2n 
n  n  n  n  rt 


A  AAA  A  A  — | 

+  (M/3  -  Z  v  -  *_  ♦  Z  6)  Z~  n 
a  n  n  n  a 


A  -1  A3  -1  -3/2 

+  vZ  n  +  uZ_  n  1  +  0  (n  J'z) 
n  n  p 


A»  -3/2 
T  +  a  r 

n  —  t 


Proposition 3.12 can be used to calculate the cumulants of $\hat T_n$:


(8.4)  x.(t')  -  0(n"3/2) 

x  n 

k2(V  “  1  +  <35-7T"3«2+ap-p2/3&+7X/3+6+2v+6«)n'’1+  0(n'3/2) 
<3(T^)  -  0(n"3/2) 

k4(T^)  -  (SX+dp^a+lS+^P+lZS^Y-Sa^u)^1  +  0(n'3/2)  . 


Once again, a judicious choice of $\nu$ and $\omega$ can remove the $n^{-1}$ terms in the above cumulants, leaving errors of order $O(n^{-3/2})$. In fact, choosing
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vn  "  V2  +  l3*l'72  ~  3/4  ~  V12 

u_  -  Tn  +  3a2/12  -  5/2  +  a  B  /2  -  p2/18  -  3/4  -  X.  /4 

a  n  n  u  no  n  ti 

where 


(ii)  If  +  t|*)  <  •,  end  If  (Y  ,t)  satisfies  the 

density assumptions (5.10)(iii) or (iv), then

$$P\{\hat T_n \le x\} = \Phi(x) + o(n^{-1})$$

uniformly in $x$.

Proof. We apply Theorems 4.8(iii) and 4.16 to $\hat T_n$ (note that now fourth moments of $(Y_n, \tau_n)$ must be included in $U_n$). The perturbation $\chi_n$ of (8.3) is a Taylor series remainder similar to that found in (7.2). One then argues as in the proof of Theorem 7.4 for $T_n$. ∎

It is interesting to examine the situation when sampling from $(Y_i, \tau_i)$, where $\tau_i \equiv 1$ and $Y_i$ is normally distributed. In this case,

$$\hat T_n = t_n - (t_n + t_n^3)/4n + O_p(n^{-3/2}).$$

This, up to a term of order $O_p(n^{-3/2})$, is the Hotelling-Frankel transformation, which was derived in [15]. This transformation was designed as a device to transform a Student t-variate with $n-1$ degrees of freedom into a r.v. with a "more" normal distribution. Theorem 8.5 thus shows that $\hat T_n$ is the nonparametric analogue of the Hotelling-Frankel transformation.
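The normalising effect of the leading term $t - (t + t^3)/4n$ can be checked numerically by applying it to a Student t quantile and comparing with the matching normal quantile. The sketch below is only an illustration: the function names are hypothetical, and the value 1.7291 used in the example is the tabulated 0.95 quantile of Student's t with 19 degrees of freedom.

```python
from statistics import NormalDist


def hotelling_frankel(t: float, n: int) -> float:
    """Leading term of the transformation: t - (t + t^3) / (4n)."""
    return t - (t + t ** 3) / (4 * n)


def gap_to_normal(t_quantile: float, n: int, p: float) -> float:
    """Distance between the transformed t quantile and the normal quantile z(p)."""
    return abs(hotelling_frankel(t_quantile, n) - NormalDist().inv_cdf(p))


if __name__ == "__main__":
    # n = 20 observations, hence n - 1 = 19 degrees of freedom.
    print(hotelling_frankel(1.7291, 20), NormalDist().inv_cdf(0.95))
    print(gap_to_normal(1.7291, 20, 0.95))
```

Before the transformation, the gap between the t quantile and $z(0.95)$ is roughly 0.08; afterwards it is of the order of a few thousandths.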

Theorem 8.5 has important consequences for confidence interval estimation. In particular, under assumption (8.5)(ii), the result proves that all the coverage errors (one-sided and two-sided) for intervals based on $\hat T_n$ are $o(n^{-1})$ uniformly in $p$. Thus, the transformation (8.1) improves coverage rates, as well as corrects for asymmetry effects.
9.  Numerical Results

In this section, we report the results of a Monte Carlo study of the coverage characteristics of "normal quantile" confidence intervals based on the pivots $t_n$, $T_n$, and $\hat T_n$.

(9.1) EXAMPLE. Choose $\tau_n \equiv 1$ and let $Y_n$ have an exponential distribution centered at 0 (i.e., $P\{Y_n > y\} = \exp(-(y+1))$ for $y \ge -1$). This example was studied in [9].

(9.2) EXAMPLE. Let $\tau_n \equiv 1$, and suppose $Y_n$ has a chi-square distribution with 10 degrees of freedom. This example, as well as (9.1), was considered in [19].

(9.3) EXAMPLE. Let $\{W_n; n \ge 1\}$ be the sequence of consecutive customer waiting times in an M/M/1 queue with arrival intensity $\lambda = 5$ and service intensity $\mu = 10$. The process $\{W_n\}$ is then a Markov chain which takes on the value 0 infinitely often. Returns to 0 constitute regeneration times for $\{W_n\}$ and thus a sequence $\{(Y_n, \tau_n)\}$ of appropriate regenerative pairs can be constructed, with a goal of estimating $EW$, the mean stationary waiting time. See IGLEHART (1971) for more details on this process.

(9.4) EXAMPLE. Let $\{B_t; t \ge 0\}$ be the busy-time process obtained from the M/M/1 queue of Example 9.3; i.e., $B_t$ is 1 or 0 depending on whether or not the server is busy at time $t$. This process regenerates itself at those instants at which a customer arrives to find a free server. Based on this sequence of regeneration times, a confidence interval for the long-run proportion of time that the server is busy can be derived, yielding a sequence $\{(Y_n, \tau_n)\}$ of regenerative pairs.
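The regenerative pairs of Example 9.3 can be generated with a few lines of simulation. The sketch below is one way to do it, assuming the usual Lindley recursion $W_{k+1} = \max(W_k + S_k - A_{k+1}, 0)$ for waiting times in queue; the function names, the seed, and the use of Python's built-in generator (rather than the generator used in the study) are all choices made for illustration.

```python
import random


def regenerative_pairs(cycles, lam=5.0, mu=10.0, seed=0):
    """Simulate `cycles` regenerative cycles of M/M/1 waiting times.

    Each pair is (Y, tau): Y is the sum of the waiting times in the cycle and
    tau is the number of customers served in the cycle.
    """
    rng = random.Random(seed)
    pairs = []
    w = 0.0            # the first customer of every cycle finds the system empty
    y, tau = 0.0, 0
    while len(pairs) < cycles:
        y += w
        tau += 1
        service = rng.expovariate(mu)
        interarrival = rng.expovariate(lam)
        w = max(w + service - interarrival, 0.0)   # Lindley recursion
        if w == 0.0:                               # next customer starts a new cycle
            pairs.append((y, tau))
            y, tau = 0.0, 0
    return pairs


def ratio_estimate(pairs):
    """Ratio estimator r_n = (sum of Y) / (sum of tau)."""
    return sum(y for y, _ in pairs) / sum(t for _, t in pairs)


if __name__ == "__main__":
    print(ratio_estimate(regenerative_pairs(20000)))
```

With $\lambda = 5$ and $\mu = 10$ the mean stationary waiting time in queue is $\lambda/(\mu(\mu - \lambda)) = 0.1$, which gives a simple sanity check on the ratio estimate.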

For Examples 9.1 and 9.2, 2500 replications of the sampling experiment were created; for Examples 9.3 and 9.4, 1000 replications. Pseudo-random numbers were obtained from the Learmonth-Lewis random number generator (see LEARMONTH and LEWIS (1973) for a description). The goal was to estimate $P\{\phi_n \le z(0.05)\}$, $P\{\phi_n \ge z(0.95)\}$ and $P\{z(0.05) \le \phi_n \le z(0.95)\}$ for $\phi_n = t_n$, $T_n$, and $\hat T_n$.
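The sampling experiment for the untransformed pivot in Example 9.1 can be sketched as follows. This is an illustrative reimplementation, not the original study: it uses Python's built-in generator with an arbitrary seed, a sample variance with divisor $n-1$ (an assumption), and the hypothetical function name `coverage_frequencies`.

```python
import math
import random
from statistics import NormalDist


def coverage_frequencies(n, reps, seed=0):
    """Estimate P{t_n <= z(0.05)}, P{z(0.05) < t_n <= z(0.95)}, P{t_n > z(0.95)}
    for centred exponential data (true mean r = 0, tau_n = 1)."""
    rng = random.Random(seed)
    z_lo = NormalDist().inv_cdf(0.05)
    z_hi = NormalDist().inv_cdf(0.95)
    counts = [0, 0, 0]
    for _ in range(reps):
        y = [rng.expovariate(1.0) - 1.0 for _ in range(n)]
        mean = sum(y) / n
        var = sum((v - mean) ** 2 for v in y) / (n - 1)
        t = math.sqrt(n) * mean / math.sqrt(var)   # pivot evaluated at the true r = 0
        if t <= z_lo:
            counts[0] += 1
        elif t <= z_hi:
            counts[1] += 1
        else:
            counts[2] += 1
    return [c / reps for c in counts]


if __name__ == "__main__":
    print(coverage_frequencies(n=5, reps=2500))
```

Because the exponential distribution is right-skewed, the lower-tail frequency should come out well above 0.05 and the upper-tail frequency below it, which is the imbalance the transformed pivots are designed to remove.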

Note that both $T_n$ and $\hat T_n$ are non-linear in the parameter $r$. Thus, in order to determine $100(1-\alpha)\%$ confidence interval boundaries based on these statistics, the zeros of some non-linear equations must be found. Specifically, in the case of $\hat T_n$, one first considers the cubic polynomial

(9.5)  $f_n(x) = \beta_n n^{-1/2} + x(1 + \nu_n n^{-1}) + x^2 \rho_n n^{-1/2} + x^3 \omega_n n^{-1}$.

Given some fixed $\epsilon > 0$, one then finds solutions $x_n(i)$ satisfying $f_n(x_n(i)) = z_i$ such that $|z_i - x_n(i)| < \epsilon$, for $z_1 = z(p+1-\alpha)$, $z_2 = z(p)$. Let $E_n$ be the event that such solutions exist uniquely, with $x_n(1) > x_n(2)$. On $\{v_n > 0, \bar\tau_n \ne 0\}$, set

(9.6)  $\hat L_n(p) = r_n - v_n^{1/2}(I_{E_n} x_n(1) + (1 - I_{E_n})\, z(p+1-\alpha))/(n^{1/2} \bar\tau_n)$
       $\hat R_n(p) = r_n - v_n^{1/2}(I_{E_n} x_n(2) + (1 - I_{E_n})\, z(p))/(n^{1/2} \bar\tau_n)$.
(9.7) PROPOSITION. (i) If $E(Y_n^{4k} + \tau_n^{4k}) < \infty$, then $1 - P(E_n) = O(n^{-k/2})$.

(ii) Under assumption (ii) of Theorem 8.5, the error asymptotics of $[\hat L_n(p), \hat R_n(p)]$ are $o(n^{-1})$, uniformly in $p$.

(iii) If $E(Y_n^4 + \tau_n^4) < \infty$, then $\hat R_n(p) - \hat L_n(p) = [R_n(p) - L_n(p)] + O(n^{-1})$.


Proof. Because of the continuity of $\beta_n$, $\nu_n$, $\rho_n$, and $\omega_n$ in $\bar U_n$, there exists $\delta$ such that $|\bar U_n - \mu| < \delta$ implies all four estimators are within $\eta$ of their limits. Hence, $|\bar U_n - \mu| < \delta$ implies that

(9.8)  $|f_n(x) - x| < K_\eta n^{-1/2}$,  $|(Df_n)(x) - 1| < K_\eta n^{-1/2}$

for some $K_\eta$, uniformly in $x$ on $[z_i - \epsilon, z_i + \epsilon]$. Thus, for $n$ sufficiently large, $f_n$ is monotone on $[z_i - \epsilon, z_i + \epsilon]$, with $f_n(z_i + \epsilon) - z_i > \epsilon/2$ and $f_n(z_i - \epsilon) - z_i < -\epsilon/2$, provided $|\bar U_n - \mu| < \delta$. So, $E_n \supseteq \{|\bar U_n - \mu| < \delta\}$ for $n$ sufficiently large, and hence $1 - P(E_n) \le P\{|\bar U_n - \mu| \ge \delta\} = O(n^{-k/2})$ (see (4.18)), provided $E(Y_n^{4k} + \tau_n^{4k}) < \infty$. Relation (9.8) also proves (iii), with the assistance of the strong law of large numbers. For (ii), we use the fact that Theorem 4.8 allows the $\chi_n$'s to be defined arbitrarily outside a neighborhood of $\mu$. ∎
For the pivots $T_n$ and $T_n^*$, a different approach is more attractive. Observe that the exponential pivots $T_n^e$ and $T_n^{*e}$ defined by

(9.9)  $T_n^e = I_{\{\rho_n \ne 0\}} \left[ \dfrac{n^{1/2}}{2\rho_n} \left( \exp\!\left( \dfrac{2\rho_n t_n}{n^{1/2}} \right) - 1 \right) + \beta_n n^{-1/2} \right] + I_{\{\rho_n = 0\}} (t_n + \beta_n n^{-1/2})$

($T_n^{*e}$ defined similarly) satisfy $T_n^e = T_n + O_p(n^{-1})$, $T_n^{*e} = T_n^* + O_p(n^{-1})$. The pivots $T_n^e$ and $T_n^{*e}$ are monotone in the parameter $r$, thereby avoiding some of the complications inherent in using $T_n$ or $T_n^*$. An argument based on Theorem 4.16 proves that confidence intervals based on $T_n^e$ or $T_n^{*e}$ enjoy the same error asymptotics as those for $T_n$ or $T_n^*$, up to order $o(n^{-1/2})$.
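The two properties claimed for an exponential pivot, agreement with the quadratic pivot to order $n^{-1}$ and monotonicity in $t_n$, are easy to check numerically. The sketch below uses one exponential form with these properties, $(\exp(2\rho_n t_n / n^{1/2}) - 1)\, n^{1/2}/(2\rho_n) + \beta_n n^{-1/2}$; this particular form and the function names are assumptions made for illustration.

```python
import math


def quadratic_pivot(t, beta, rho, n):
    """T_n = t + beta n^{-1/2} + rho t^2 n^{-1/2}, as in (7.1)."""
    root_n = math.sqrt(n)
    return t + beta / root_n + rho * t * t / root_n


def exponential_pivot(t, beta, rho, n):
    """A monotone surrogate for the quadratic pivot (assumed form, see lead-in)."""
    root_n = math.sqrt(n)
    if rho == 0.0:
        return t + beta / root_n
    # expm1(u) = exp(u) - 1, computed accurately for small u.
    return math.expm1(2.0 * rho * t / root_n) * root_n / (2.0 * rho) + beta / root_n


if __name__ == "__main__":
    for t in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(t, quadratic_pivot(t, 0.3, 0.5, 100), exponential_pivot(t, 0.3, 0.5, 100))
```

Expanding the exponential shows that the two pivots differ by approximately $2\rho_n^2 t_n^3/(3n)$, while the exponential version is increasing in $t_n$ for either sign of $\rho_n$, so that an inequality on the pivot inverts directly into a bound on $r$.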

It should be noted, however, that the coverage estimates in Examples 9.1 through 9.4 were computed using the estimators $t_n$, $T_n$, and $\hat T_n$ explicitly. In other words, because the value of $r$ was known for each of the examples, the three pivots were explicitly calculated to determine which of the intervals $I_1 = (-\infty, z(0.05)]$, $I_2 = (z(0.05), z(0.95)]$, and $I_3 = [z(0.95), \infty)$ covered the pivots. In practice, of course, one would have to explicitly calculate the confidence interval boundaries. For the pivot $\hat T_n$, this would require finding roots of the cubic $f_n(x)$, and substituting into (9.6). Given the fact that $f_n(x) = x + O_p(n^{-1/2})$, Newton's method for root-solving should be quite well-behaved, and hence the numerical difficulties involved in solving (9.5) should not be too significant.
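A minimal sketch of the Newton iteration for $f_n(x) = z$ follows, started at $x = z$ since $f_n$ is a small perturbation of the identity. The function names, tolerance, and iteration cap are illustrative assumptions; the coefficient values in the example are arbitrary and not taken from the study.

```python
def cubic(x, beta, nu, rho, omega, n):
    """f_n(x) of (9.5)."""
    root_n = n ** 0.5
    return beta / root_n + x * (1.0 + nu / n) + x * x * rho / root_n + x ** 3 * omega / n


def cubic_derivative(x, nu, rho, omega, n):
    """Derivative of f_n with respect to x."""
    root_n = n ** 0.5
    return (1.0 + nu / n) + 2.0 * x * rho / root_n + 3.0 * x * x * omega / n


def solve_cubic_newton(z, beta, nu, rho, omega, n, tol=1e-12, max_iter=50):
    """Solve f_n(x) = z by Newton's method, starting from x = z."""
    x = z
    for _ in range(max_iter):
        step = (cubic(x, beta, nu, rho, omega, n) - z) / cubic_derivative(x, nu, rho, omega, n)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton iteration did not converge")


if __name__ == "__main__":
    root = solve_cubic_newton(1.645, beta=0.3, nu=0.5, rho=0.2, omega=0.1, n=25)
    print(root, cubic(root, 0.3, 0.5, 0.2, 0.1, 25))
```

In an implementation along the lines of (9.6), one would also check that the returned root lies within $\epsilon$ of $z$ and fall back to the normal quantile otherwise.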
Table 1 displays the results for the exponential and chi-square examples. Table 2 illustrates the behavior of the pivots for the M/M/1 queueing process examples. It should be pointed out that for the $\{W_n\}$ process, $\{(Y_n, \tau_n)\}$ does not satisfy assumptions (iii) or (iv) of Theorem 5.10, since $\tau_n$ is a lattice r.v. in this case. However, it can be shown that the other three examples do satisfy the conditions of Theorem 5.10.

Note that Examples 9.1 through 9.3 appear to confirm the error asymptotics of Sections 6, 7, and 8. The pivot $T_n$ tends to "balance" the one-sided coverage probabilities, moving them towards their correct values of 0.05. This confirms the asymmetry correction induced by the Johnson pivotal transformation. The pivot $\hat T_n$ goes one step further: it seems to deal reasonably well with the overall confidence interval coverage rate. In Example 9.4, all three methods do well. Such an outcome is not surprising, in light of the fact that the sample skewness $\mu_n$, for a sample size of 1000, was only 0.01.
TABLE 1

                         Exponential                  Chi-square
                      2500 replications            2500 replications
Sample                  Coverage Rates               Coverage Rates
Size    Pivot        I1       I2       I3         I1       I2       I3

  5     t_n        0.210    0.758    0.032      0.134    0.812    0.054
        T_n        0.173    0.793    0.034      0.120    0.823    0.057
        T^_n       0.166    0.784    0.050      0.122    0.813    0.065

 10     t_n        0.154    0.820    0.026      0.092    0.866    0.042
        T_n        0.106    0.851    0.043      0.079    0.871    0.050
        T^_n       0.082    0.874    0.044      0.069    0.883    0.048

 15     t_n        0.136    0.838    0.026      0.085    0.872    0.043
        T_n        0.096    0.860    0.044      0.068    0.878    0.054
        T^_n       0.061    0.897    0.042      0.060    0.890    0.050

 20     t_n        0.121    0.855    0.024      0.076    0.883    0.041
        T_n        0.088    0.870    0.042      0.059    0.885    0.056
        T^_n       0.053    0.908    0.039      0.048    0.901    0.051

 25     t_n        0.113    0.864    0.023      0.072    0.890    0.038
        T_n        0.075    0.881    0.044      0.058    0.888    0.054
        T^_n       0.040    0.925    0.035      0.052    0.902    0.046

(T^_n denotes the second-order pivot $\hat T_n$ of Section 8.)
TABLE 2

                      Waiting Times W_n              Busy Time B_t
                      1000 replications            1000 replications
Sample                  Coverage Rates               Coverage Rates
Size    Pivot        I1       I2       I3         I1       I2       I3

  40    t_n        0.333    0.649    0.018      0.081    0.884    0.035
        T_n        0.201    0.756    0.043      0.049    0.896    0.055
        T^_n       0.065    0.802    0.133      0.049    0.898    0.053

  80    t_n        0.256    0.726    0.018      0.067    0.893    0.040
        T_n        0.176    0.787    0.037      0.046    0.895    0.059
        T^_n       0.025    0.891    0.084      0.042    0.901    0.057

 120    t_n        0.242    0.741    0.017      0.060    0.883    0.049
        T_n        0.148    0.809    0.043      0.052    0.889    0.058
        T^_n       0.020    0.910    0.070      0.052    0.889    0.059

 160    t_n        0.219    0.767    0.014      0.072    0.878    0.050
        T_n        0.131    0.833    0.036      0.053    0.886    0.061
        T^_n       0.018    0.937    0.045      0.054    0.885    0.061

 200    t_n        0.194    0.792    0.014      0.065    0.887    0.048
        T_n        0.117    0.846    0.037      0.050    0.888    0.062
        T^_n       0.016    0.948    0.036      0.051    0.888    0.061

(T^_n denotes the second-order pivot $\hat T_n$ of Section 8.)
TABLE 2 (cont'd)

                      Waiting Times W_n              Busy Time B_t
                      1000 replications            1000 replications
Sample                  Coverage Rates               Coverage Rates
Size    Pivot        I1       I2       I3         I1       I2       I3

 400    t_n        0.164    0.818    0.018      0.053    0.892    0.055
        T_n        0.095    0.858    0.047      0.043    0.892    0.065
        T^_n       0.019    0.954    0.027      0.043    0.891    0.066

 800    t_n        0.124    0.852    0.024      0.047    0.897    0.056
        T_n        0.075    0.873    0.052      0.041    0.894    0.065
        T^_n       0.027    0.945    0.028      0.041    0.895    0.064

1200    t_n        0.121    0.844    0.035      0.057    0.882    0.061
        T_n        0.091    0.850    0.059      0.053    0.880    0.067
        T^_n       0.038    0.926    0.036      0.052    0.881    0.067

1600    t_n        0.108    0.857    0.035      0.049    0.890    0.061
        T_n        0.076    0.861    0.063      0.044    0.886    0.070
        T^_n       0.037    0.924    0.039      0.044    0.885    0.071

2000    t_n        0.103    0.859    0.038      0.048    0.893    0.059
        T_n        0.073    0.858    0.069      0.043    0.892    0.065
        T^_n       0.044    0.911    0.045      0.043    0.892    0.065

(T^_n denotes the second-order pivot $\hat T_n$ of Section 8.)